Chemical and Enzymatic Methods for Post-Translational Protein–Protein Conjugation

Fusion proteins play an essential role in the biosciences but suffer from several key limitations, including the requirement for N-to-C terminal ligation, incompatibility of constituent domains, incorrect folding, and loss of biological activity. This perspective focuses on chemical and enzymatic approaches for the post-translational generation of well-defined protein–protein conjugates, which overcome some of the limitations faced by traditional fusion techniques. Methods discussed range from chemical modification of nucleophilic canonical amino acid residues to incorporation of unnatural amino acid residues and a range of enzymatic methods, including sortase-mediated ligation. Through summarizing the progress in this rapidly growing field, the key successes and challenges associated with using chemical and enzymatic approaches are highlighted and areas requiring further development are discussed.


1. INTRODUCTION
Protein−protein conjugates are biomolecules generated from two or more protein domains. The ability to place proteins with mutually exclusive functions in the same location at the same time has the potential to yield properties that would be impossible to achieve using the component protein monomers in isolation. These biomolecules have a diverse range of applications in the fields of biotechnology and biopharmaceutical research. Nature has evolved numerous post-translational protein−protein conjugates, a key example being the covalent conjugation of multiple ubiquitin subunits to protein substrates, tagging them for degradation by the ubiquitin−proteasome pathway, and in turn regulating cellular processes or clearing aberrant proteins. 1 Post-translational ubiquitination has been particularly well studied (Figure 1), and therefore synthetic methods for ubiquitination and subsequent applications are not discussed further herein. 2−4 An indispensable method for generating protein−protein conjugates has been via so-called "fusion proteins", generated by translation of a designed DNA sequence. The products of this technology have found applications in protein purification, imaging, and in the production of bifunctional engineered enzymes and bispecific antibodies. 5−9 Genetic fusion proteins are produced by expression of a gene and result in a single polypeptide chain. The "linker" is the portion of the polypeptide chain that resides between the two protein domains, and its physical characteristics are determined by its constituent amino acids; properties such as flexibility, length, or ability to be cleaved in vivo can be tuned by changing the linker. 10 It is clear that genetic fusion is a powerful technique for generating protein−protein conjugates, as evidenced by its diverse range of applications.
Nonetheless, there are some key limitations faced when using these approaches, which have the potential to be overcome using alternative, post-translational, or "synthetic" conjugation methods. Limitations of recombinantly expressed fusion proteins can include poor yields, incorrect folding, poor stability, and the restrictive necessity of N-to-C terminal fusion, which is particularly challenging in cases when free termini are required to retain biological activity, an area where chemical and enzymatic processes can play a key role. 5,11,12 In addition, recombinant expression of certain fusion proteins is not feasible, as they may require separate cell lines for expression, as is the case with some immunotoxin conjugates. 13 By obviating the need for genetic fusion, each protein domain can be expressed independently and subsequently ligated using chemical or biochemical conjugation strategies (Figure 1). In line with the rapidly expanding toolbox for site-selective (targeting a single type of amino acid residue) and site-specific (targeting a single amino acid residue over all others in the protein) protein modification, methods for synthetically generating protein−protein conjugates have seen significant advancements. 14−17

2. THE PROTEIN−PROTEIN COUPLING PROBLEM
The biggest challenge facing the preparation of protein−protein conjugates is one of kinetics. In traditional bioconjugation reactions between a protein and a small molecule, a common strategy is to use a high stoichiometric excess of the latter in order to increase reaction velocity. This is necessary because proteins are generally present in low concentrations (1−100 μM) and are also large, rendering them sterically encumbered coupling partners. 18,19 However, in the case of protein−protein coupling reactions, it is generally not practical to use a large stoichiometric excess of one partner. The protein−protein coupling problem arises because two of these sterically encumbered coupling partners, both present at low concentrations, must come together to form the desired protein−protein conjugate.
Naturally, this problem has been addressed by using reactions that have high second-order rate constants (k₂). 20 Therefore, a common theme is the inclusion of functional groups for "click chemistry", which can themselves be introduced using several different methods. A recent survey and comparison of various click partners found that the use of endo-bicyclononyne and methyltetrazine partners in an inverse electron demand Diels−Alder (IEDDA) cycloaddition was most effective (k₂ = 70 M⁻¹ s⁻¹) and therefore these partners might be suitable for the first iteration of any click-mediated strategy. 21 Two other general approaches have arisen to solve this problem, both of which aim to effectively increase the local concentration of the protein coupling partners. The first is to use two proteins of opposing charge (i.e., isoelectric points (pIs) either side of 7, Figure 3) in order to bring them into contact via electrostatic interactions. 22 The second involves producing proteins with an affinity for a surface; this was achieved with a protein bearing a His₆-tag which binds to an agarose surface displaying Ni(II). 23 Although this approach was actually used for coupling of proteins to a liposome bearing Gly₃ motifs, the concept should be applicable to protein−protein conjugation.
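The kinetic argument above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes ideal second-order kinetics with no steric penalty, uses the rate constant of 70 M⁻¹ s⁻¹ quoted above for the bicyclononyne/methyltetrazine pair, and the function names and the 1 mM excess concentration are hypothetical choices made for the example. It compares the half-life of an equimolar protein−protein coupling with that of a reaction in which one partner is present in large excess.

```python
import math

K2_IEDDA = 70.0  # M^-1 s^-1, value quoted in the text for BCN + methyltetrazine


def equimolar_half_life(k2, conc_molar):
    """Half-life (s) for A + B -> AB when [A]0 = [B]0 = conc_molar."""
    return 1.0 / (k2 * conc_molar)


def equimolar_conversion(k2, conc_molar, t_seconds):
    """Fraction converted at time t for the equimolar second-order case."""
    x = k2 * conc_molar * t_seconds
    return x / (1.0 + x)


def excess_half_life(k2, excess_molar):
    """Pseudo-first-order half-life (s) when one partner is in large excess."""
    return math.log(2) / (k2 * excess_molar)


if __name__ == "__main__":
    # Typical protein concentrations mentioned in the text: 1-100 uM
    for conc_um in (1, 10, 100):
        t_half = equimolar_half_life(K2_IEDDA, conc_um * 1e-6)
        print(f"{conc_um:>3} uM each partner: t1/2 = {t_half / 60:.1f} min")
    # Hypothetical comparison: a small-molecule reagent at 1 mM excess
    print(f"1 mM excess reagent: t1/2 = {excess_half_life(K2_IEDDA, 1e-3):.1f} s")
```

Under these assumptions the half-life at 10 μM of each partner is about 24 min, versus roughly 10 s when one reagent is supplied at 1 mM, which illustrates why the loss of the excess-reagent strategy is so costly for protein−protein coupling.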
Despite the protein−protein coupling problem, several strategies have emerged for preparing protein−protein conjugates which are summarized herein. Methods encountered rely on chemical modification of native side chains (Section 3), incorporation of unnatural amino acids (Section 4), or the use of enzymatic reactions or sequence tags (Section 5).

3. TARGETING CANONICAL AMINO ACIDS
Attempts to generate protein−protein conjugates using chemical conjugation strategies have been pursued for over half a century. 24 Early methods using bifunctional chemical reagents relied heavily on the inherent nucleophilicity of cysteine residues. Approaches included the use of reagents such as bifunctional cysteine-selective organomercury reagents to generate sulfur−mercury linkages between proteins. 25 However, a more recognizable cysteine-targeting strategy, which remains extremely popular to this day, is the use of homobifunctional bismaleimide reagents for the conjugation of proteins through reduced cysteine thiols. 26 Amino acids that contain nucleophilic amine or hydroxyl side chains were also exploited in early protein−protein conjugation strategies. 24 Amine-reactive bifunctional reagents incorporating functionalities such as diisocyanates, α,ω-dialdehydes including glutaraldehyde, and halonitrobenzenes, which also react with histidine and the hydroxyl groups of tyrosine, have been exploited in protein−protein conjugation. 27−29 In addition, carbodiimides were used to cross-link carboxylic acids and free amino groups of different protein domains, while imidoesters were used to cross-link amine groups, including those found in lysine residues. 30−32 A range of less residue-specific, general nucleophile-targeting cross-linking strategies were also developed to generate protein−protein conjugates. These include bisepoxide, s-triazine, and aziridine functionalities and are discussed in a comprehensive review on cross-linking strategies. 24

3.1. Cysteine-Targeting Reagents. Although many of the early strategies that targeted native residues successfully produced the desired protein−protein conjugates, they lacked specificity, resulting in conjugation through multiple residues on each protein.
This drawback, coupled with the advent of powerful genetic engineering technology in the 1990s, meant interest in chemical generation of protein−protein conjugates did not endure. However, recent advances in both site-selective and site-specific protein modification strategies that exhibit exquisite control have prompted a resurgence in the pursuit of chemically linked protein−protein conjugates. There remains an overwhelming preference for targeting nucleophilic amino acid residues, in particular cysteine, due to its highly selective reactivity profile and low natural abundance. 33,34

3.1.1. Single-Residue-Targeting Homobifunctional Reagents. Homobifunctional linking strategies rely on symmetric molecules with an identical reactive functionality on both ends of a linker. These homobifunctional molecules target the same amino acid residue on each of the protein domains being conjugated, although the specific environment does not necessarily have to be identical. These can be used to generate homo- or heterodimeric protein−protein conjugates in one-pot or sequential reaction protocols, respectively (Figure 2).
A popular homobifunctional linking approach exploits one of the most ubiquitous conjugation strategies used in chemical biology: cysteine-maleimide conjugation (Figure 2). 22,35−38 To achieve conjugation, an odd number of exposed cysteine residues is required to allow one end of the homobifunctional molecule to remain unconjugated and present a reactive handle to which a second protein can be conjugated. Conjugation of mouse monoclonal and rabbit polyclonal hinge cysteine-containing antigen binding fragments (Fab′), both containing three cysteine residues in their hinge region, was performed with an ortho-phenylenedimaleimide linker. 35 This method was also used to prepare mouse−mouse and mouse−rabbit Fab′ bispecific antibodies upon addition of a second cysteine-containing Fab′. However, the even number of disulfide bonds found in the hinge region of human immunoglobulin G antibodies (IgGs) makes this approach incompatible with human Fab′ dimerization and therefore less therapeutically relevant. 35 A variation of this approach used antigen binding fragment (Fab) re-engineering to introduce a reactive unpaired cysteine into a Fab, followed by dimerization with homobifunctional bismaleimide reagents. 36 To achieve this, recombinantly expressed Fabs containing engineered cysteine residues, termed thio-Fabs, were conjugated using a bismaleimide coupling reagent with a polyethylene glycol (PEG) linker to form bis-thio-Fab heterodimeric species named biFabs. Heterodimerization was achieved via the addition of an excess of bismaleimide to the initial thio-Fab, generating thio-Fabs presenting electrophilic maleimide handles. These could subsequently undergo conjugation to a second thio-Fab domain.
This particular study focused on thio-Fabs generated from the human epidermal growth factor receptor 2 (HER2)-targeting antibody, trastuzumab. All thio-Fabs conjugated in this manner varied only with respect to the location of the engineered cysteine in the Fab domain. All biFabs were therefore by definition monospecific, but could elicit extremely different biological responses. Depending on the orientation of the variable fragment (Fv) regions of the biFabs, the conjugates could either promote or inhibit breast tumor cell growth. 36 This study highlights the impact that chemically conjugating protein domains at predefined internal sites without relying on N-to-C terminal ligation can have on biological properties.
Beyond bispecific antibody production, homobifunctional bismaleimide reagents were used to explore the effect of global protein charge in the one-pot dimerization of high molecular weight proteins. 22 Treatment of two proteins of opposing net charges (bovine serum albumin (BSA), pI = 4.7 and cytochrome c, pI = 10.6) with a bismaleimide reagent provided the corresponding heterodimer in yields of up to 30% (Figure 3). In contrast, two proteins of similar charges (cytochrome c, pI = 10.6 and GFP, pI = 8.3) under the same conditions gave the corresponding heteroconjugate in <1% yield, clearly demonstrating the importance of the physicochemical properties of precursors for protein−protein conjugation.
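The charge-complementarity criterion used in this study can be expressed as a simple screening rule. The following is a minimal sketch, not the authors' method: it assumes that the sign of a protein's net charge is set only by whether its pI lies above or below the working pH (taken here as 7, as in the "either side of 7" description), ignores charge magnitude and local charge distribution, and uses hypothetical function names. The pI values are those quoted above.

```python
# pI values quoted in the text
PI_VALUES = {"BSA": 4.7, "cytochrome c": 10.6, "GFP": 8.3}


def net_charge_sign(pi, ph=7.0):
    """Return +1 if the protein is net positive at this pH, -1 if net negative, 0 at the pI."""
    if pi > ph:
        return 1
    if pi < ph:
        return -1
    return 0


def electrostatically_complementary(pi_a, pi_b, ph=7.0):
    """True when the two proteins carry opposite net charges at the given pH."""
    return net_charge_sign(pi_a, ph) * net_charge_sign(pi_b, ph) < 0


if __name__ == "__main__":
    pairs = [("BSA", "cytochrome c"), ("cytochrome c", "GFP")]
    for a, b in pairs:
        ok = electrostatically_complementary(PI_VALUES[a], PI_VALUES[b])
        print(f"{a} + {b}: opposite net charge at pH 7 -> {ok}")
```

The rule flags the BSA/cytochrome c pair as complementary and the cytochrome c/GFP pair as not, matching the direction of the reported yields; it says nothing about how large the yield difference will be.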
Although fast and accessible, maleimide conjugation strategies suffer from well-documented drawbacks, in particular the susceptibility of conjugates to undergo retro-Michael addition under physiological conditions. 37,39 This characteristic is suboptimal for biologics, and therefore, alternative conjugation approaches exhibiting enhanced stability under physiological conditions have been the subject of much investigation. One such example is the use of S-alkynyl sulfonium reagents, which generate stable ubiquitin−ubiquitin homodimers (Figure 2). 40

3.1.2. Disulfide Rebridging Homobifunctional Reagents. Homobifunctional reagents based on rebridging of disulfide bonds in Fab and single chain variable fragment (scFv) antibody domains have also been employed to generate bispecific antibodies. These techniques are an accessible method for generating antibody dimers, as the constituent scFv domains can be easily acquired.
The development of a reagent that features two bis-sulfone groups for disulfide rebridging at either end of a PEG linker led to the preparation of Fab-PEG-Fab conjugates with comparable or better binding and in vitro efficacy than their corresponding parent IgGs, which target HER2 and vascular endothelial growth factor (VEGF) (Figure 2). 41 This method was limited to the generation of homodimeric conjugates, and relatively low yields were achieved (18%). However, it was reported that the resulting dimers maintained their activity after storage at 4°C for six months, highlighting the stability of the linkage generated by this disulfide rebridging approach.
Heterodimeric Fab-scFv conjugates were prepared using "next-generation maleimide" reagents that feature halogens on the sp² carbons of the maleimide functional group. In this case, the reagent featured two 2,3-dibromomaleimide (DBM) reactive groups at either end of a PEG linker and was employed in an analogous manner to the bis-sulfone approach described previously (Figure 2). 42 Yields of up to 52% were achieved in the production of heterodimeric conjugates using a sequential addition strategy. This strategy was subsequently improved using a more reactive and hydrolytically stable 2,3-diiodomaleimide (DIM) species. 43 Exploiting the slower rate of DIM hydrolysis compared to DBM allowed more sterically hindered systems such as trimeric scFv formats and human serum albumin (HSA)-scFv or Fab conjugates to be produced. This was achieved by overcoming the competing hydrolysis of DBM to unreactive dibromomaleamic acid, allowing more sterically hindered thiols to react. 42,43 Upon hydrolysis of DIM, serum stable maleamic acid conjugates were generated. However, incubation at 37 °C for up to 72 h was required for complete hydrolysis. Therefore, the development of conjugation strategies which directly form stable products, without the need for hydrolysis, may be beneficial to avoid extended incubation times. 43 In general, homobifunctional disulfide linking approaches are advantageous because the corresponding dimers can be generated from any Fab which can be produced enzymatically from commercially available therapeutic antibodies. Conceptually, any disulfide rebridging reagent that can be placed at either end of a linker could be utilized to achieve similar effects to those described. 44 This approach is therefore accessible to researchers without facilities for protein engineering and expression.
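The competition between productive conjugation and reagent hydrolysis described above can be illustrated with a simple parallel-reaction model. This is a rough sketch under stated assumptions, not a description of the cited experiments: the thiol concentration is treated as constant (pseudo-first-order), hydrolysis is treated as first-order, and every rate constant in the example is a hypothetical placeholder chosen only to show the trend.

```python
def fraction_conjugated(k_conj, thiol_molar, k_hyd):
    """Fraction of reagent that ends up conjugated rather than hydrolyzed.

    k_conj      second-order rate constant for thiol addition (M^-1 s^-1)
    thiol_molar protein thiol concentration, assumed constant (M)
    k_hyd       first-order hydrolysis rate constant of the reagent (s^-1)
    """
    rate_conj = k_conj * thiol_molar
    return rate_conj / (rate_conj + k_hyd)


if __name__ == "__main__":
    thiol = 10e-6  # 10 uM protein thiol, within the range typical for proteins
    # Hypothetical values: a hindered thiol (low k_conj) with fast or slow hydrolysis
    for label, k_hyd in (("faster-hydrolyzing reagent", 1e-4),
                         ("slower-hydrolyzing reagent", 1e-5)):
        f = fraction_conjugated(k_conj=10.0, thiol_molar=thiol, k_hyd=k_hyd)
        print(f"{label}: {f:.0%} of reagent conjugated")
```

In this model a tenfold reduction in the hydrolysis rate raises the conjugated fraction from about half to about nine-tenths for the same hindered thiol, which is the qualitative effect attributed to replacing DBM with DIM.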
3.1.3. Click Handle Installation at Cysteine. As one of the most ubiquitously exploited classes of bioorthogonal reactions, click chemistry has been widely utilized to generate protein− protein conjugates. The most commonly used reactions include those between terminal or strained alkynes with azides or tetrazines, in the presence or absence of Cu(I), depending on the specific reactive partners chosen. 45 Once a bioorthogonal pair of components for click chemistry has been selected, they can be installed on proteins using one of several bioconjugation strategies.
Early attempts to generate protein−protein conjugates via click-based methods used Cu(I)-catalyzed azide−alkyne cycloaddition (CuAAC) between proteins bearing these two functionalities (Figure 4). 46−50 One example of generating protein−protein conjugates in this way was to install the alkyne and azide groups at cysteine residues via bromoacetamide conjugation, generating di-scFvs upon dimerization via CuAAC. 46 After conjugation of a reagent featuring a terminal trialkyne moiety in place of a monoalkyne derivative, an improvement in conversion from 33% to 74% was observed and this was attributed to an increased effective concentration of alkyne. Binding to the Mucin-1 peptide and to prostate and breast cancer cell lines was up to four times higher for the dimers compared to the parent scFv fragments. Similar results were also observed in subsequent CuAAC-mediated conjugation of di-scFvs, successfully generating multivalent conjugates. 47 Approaches using CuAAC were also used to produce cross-linked hemoglobin 48,49 and BSA-lipase heterodimers. 50
Alternative click-based methods which do not require Cu(I) catalysis have been explored, including strain-promoted azide−alkyne cycloaddition (SPAAC) and strain-promoted IEDDA cycloaddition in which strained unsaturated systems such as dibenzocyclooctyne (DBCO), trans-cyclooctene (TCO), or bicyclo[6.1.0]nonyne (BCN) react with either azide or tetrazine moieties (Figure 4). 45 In the case of SPAAC conjugation, an azide undergoes a click reaction with a strained alkyne such as DBCO, to generate protein−protein conjugates. 13,51−53 SPAAC conjugation enabled the dimerization of anti-prostate-specific membrane antigen (PSMA) and anti-cluster of differentiation 3 (CD3) Fab fragments to generate bispecific T cell engaging antibodies against prostate cancer.
51 DBCO handles were installed on Fab fragments via a disulfide reduction and rebridging approach with heterofunctional dibromomaleimide molecules, in a conceptually analogous method to the aforementioned disulfide rebridging homobifunctional linking strategies. 42 Subsequently, a PEG bis-azide reagent was used to introduce a surface exposed azide moiety which could in turn react with a second DBCO-Fab fragment to generate bispecific antibodies. The bispecific antibody produced using this method displayed high potency in the picomolar range against PSMA-expressing prostate cancer cell lines and selectively bound the respective antigens of the constituent Fab fragments (PSMA and CD3), indicating that the conditions used in SPAAC conjugation were sufficiently mild to conserve the biological activity of the parent domains. This approach was also demonstrated on full length IgGs by installing azide and DBCO handles in the hinge region of anti-HER2 and anti-epidermal growth factor receptor (EGFR) antibodies to generate a full length bispecific antibody which retained the potency of its constituent domains. 52 Another study used bromoacetamide reagents to install azide and DBCO handles at reduced disulfide cysteine residues in the hinge region of anti-CD3 and anti-carcinoembryonic antigen (CEA) antibodies. 53 This approach generated bispecific T cell engagers (BiTEs) based on full length antibodies, termed dual-specific, bivalent BiTEs. These dual-specific, bivalent BiTEs were shown to successfully redirect T cells to kill CEA+ cells in transgenic mice. Cyclopropenone-based reagents, which react selectively with N-terminal cysteine residues over internal cysteine residues, were used to produce N-to-N terminally conjugated dimers of a de novo designed mimic of the IL2 cytokine.
54 IEDDA click chemistry is another tool available for generating protein−protein conjugates typically employing the strained molecules TCO or BCN, in conjunction with tetrazine moieties. Using this click chemistry approach, dimerization of T4 lysozyme resulted in an 8-fold improvement in yield compared to a bismaleimide homobifunctional strategy, with yields of 38% and 5%, respectively. 55 A homobifunctional click reagent featuring potassium acyltrifluoroborate (KAT) groups connected by a PEG linker was used to generate homodimers of cysteine-containing T4 lysozyme and superfolder GFP (sfGFP) mutants. 56 The latter conjugation strategy introduced hydroxylamine functional groups at engineered surface exposed cysteine residues using cysteine-selective bifunctional molecules featuring a methylsulfonephenyl-oxadiazole-hydroxylamine motif. The hydroxylamine functionalized proteins were found to rapidly react with homobifunctional KAT reagents, giving up to 72% conversion after 5 h. However, it should be noted that the reaction was found to proceed most efficiently at pH 3.6. The necessity for highly acidic conditions could make this approach less compatible with sensitive protein domains.
The orthogonal nature of IEDDA and CuAAC click chemistry was exploited to generate bispecific antibodies which were subsequently dual-functionalized with two different payloads (Figure 5). 57 Disulfide rebridging of two Fab fragments was carried out with bifunctional pyridazinedione reagents functionalized with either the strained alkyne BCN or a tetrazine moiety, respectively. These two orthogonally labeled conjugates were subsequently coupled using an IEDDA reaction to generate bivalent and bispecific antibodies from trastuzumab, rituximab, and cetuximab, which target HER2, CD20, and EGFR receptors, respectively. Since both nitrogen atoms of the pyridazinedione can be functionalized, a second orthogonally reactive terminal alkyne moiety was introduced. Upon generation of the conjugates, CuAAC chemistry was carried out to generate bispecific antibodies labeled with Alexa Fluor 488 in 56% yield. By prelabeling the tetrazine-bearing fragment with azide-bearing sulfo-Cy5.5 dye, a dual-labeled antibody was generated by ligating a second dye, in the form of an azide-bearing Alexa Fluor 488, after bispecific antibody generation with an impressive yield of 55%. In doing so, the authors demonstrated the potential for labeling of chemically generated bispecific antibodies with well-defined conjugation patterns of one, two of the same, or two unique payloads. Although not explored in this study, this approach could in theory be extended to drug payloads, thus producing a dual payload bispecific antibody−drug conjugate. 58

3.1.4. Cysteine Reactive Heterobifunctional Reagents. An alternative class of compounds used in protein−protein conjugation are heterobifunctional reagents. These comprise a small molecule reagent containing two orthogonally reactive moieties connected by a linker. Due to the challenges associated with targeting two distinct amino acid residues on separate protein domains, while avoiding intramolecular cross-linking, this approach is challenging to execute. However, a number of research groups have managed to overcome these challenges using orthogonally reactive functionalities with high specificity. 59−61 One approach exploited with heterobifunctional reagents is to protect one reactive functionality during the first conjugation step. Using a bis-sulfone functionality, discussed in Section 3.1.2, combined with a maleimide functionality, an orthogonal cysteine-selective heterobifunctional reagent controlled by a pH switch was developed. 59 At pH 6, upon addition of an excess of heterobifunctional reagent, the maleimide functional group is sufficiently reactive to undergo conjugation at cysteine on the first protein domain. After dialysis and increasing the pH to between 7 and 8, the bis-sulfone functional group underwent E1cB elimination of p-toluene sulfinic acid, thus generating a reactive Michael acceptor moiety on the protein which reacted with the second cysteine-containing protein. This enabled the preparation of HSA−BSA heterodimers in 10% yield. Although pH switching is an interesting concept, an analogous product might have been achieved with a bismaleimide conjugation strategy without the need for pH switching.
Another strategy using heterobifunctional molecules relies upon the activation of a functional group toward a second conjugation step only after it has undergone reaction with the first protein domain. In one example, a vinylphosphonite reagent was used to generate an electrophilic, cysteine-reactive, vinylphosphonothiolate handle at a modified cysteine (Figure 6). 60 This strategy was demonstrated by producing diubiquitin and ubiquitin−α-synuclein conjugates with up to 80% conversion. However, the scope beyond the generation of relatively small ubiquitin-containing conjugates, a field which has been widely studied, has not yet been demonstrated. 62 In addition to solely targeting a single cysteine residue, reactive functionality pairings such as N-hydroxysuccinimide (NHS)-ester/maleimide or N-[ϵ-maleimidocaproyloxy]sulfosuccinimide ester (Sulfo-EMCS)/maleimide that target lysine and cysteine residues, respectively, were developed. 61 These approaches successfully generated protein−protein conjugates; however, multimeric species were also produced due to the nonspecific nature of targeting lysine residues. 63 Nonetheless, this relatively crude linking strategy allowed researchers to investigate the effect of chemically linked Plasmodium falciparum Merozoite Surface Protein-1 conjugates on immunogenicity in mice when compared to monomeric or oligomeric forms. 61 This study highlights that, occasionally, heterogeneous protein−protein conjugates can be used to answer biological questions without resorting to more complex conjugation strategies.

3.1.5. Metal-Mediated Conjugation Strategies. In the field of site-selective protein modification, metal-mediated cross-coupling reactions on protein substrates are rapidly gaining interest. A range of metal-mediated, cysteine-selective coupling strategies using Pd(II), Au(I/III), Ni(II), and Pt(II)-based organometallic reagents have been developed for the chemoselective modification of cysteine. 64−69 A variety of metal-mediated methods have subsequently been used to generate protein−protein conjugates, typically employing a heterobifunctional linking approach.
Of the metal-based conjugation strategies, novel methods used to generate bench stable Pd-protein oxidative addition complexes (Pd-protein OACs) have received the most interest for producing protein−protein conjugates. 70,71 Initial studies site-selectively modified cysteine using a Pd-OAC generated from 1,4-dihaloarene, proceeding through a hypothesized π-complex intermediate followed by intramolecular oxidative addition to generate the aforementioned Pd-protein OAC (Figure 6). 70 Upon formation of the Pd-protein OAC, the electrophilic handle installed at cysteine subsequently reacted with a solvent exposed cysteine residue on a second protein, effectively generating heterodimeric conjugates through the formation of stable C(sp²)−S bonds. This method is conceptually similar to the previously discussed vinylphosphonite reagent which undergoes sequential activation after reacting with a cysteine residue on the first protein domain. 60 Conversions of up to 79% were successfully achieved, and the method was demonstrated on proteins with molecular weights of up to 83 kDa, clearly displaying the general applicability of this method for cysteine-containing proteins of various sizes.
This approach was applied to smaller synthetic proteins derived from flow peptide synthesis (<100 amino acids). 72 By combining flow synthesis and Pd-mediated cysteine−cysteine conjugation chemistry, a panel of bioactive, covalently cross-linked transcription factor (TF) homo- and heterodimers were generated. These displayed improved stability compared to the corresponding noncovalent complexes and led to inhibition of oncogenic proliferation. This study clearly demonstrates the potential that purely synthetic methods hold in developing active bimolecular protein therapeutics. Interestingly, the same homo- and heterodimeric TFs were produced using a solely automated flow approach developed in parallel by the same group. 73 Although this approach produced similarly bioactive TF homo- and heterodimers which also attenuated oncogenic activity, this method resulted in low yields (14%) when compared to other chemical conjugation approaches. Nonetheless, this research highlights the potential that flow synthesis holds for rapid, high-throughput production of protein−protein conjugates, and over time will undoubtedly be further optimized to achieve improved yields.
Protein−protein conjugates were prepared through lysine residues using a heterobifunctional molecule comprising both an amine-reactive NHS ester and a Pd-OAC functional group (Figure 6). Initial acylation of lysine residues with the NHS ester was used to install a Pd-OAC group, which could subsequently undergo a second conjugation step with the cysteine residue of a different protein. 71 One downside was the unselective nature of the NHS ester acylation strategy, making this approach less useful for generating well-defined protein−protein conjugates. This was demonstrated through the generation of a heterogeneous mixture of RNase A-Pd OAC complexes. 71 Nonetheless, this approach was effectively used to generate protein−protein conjugates with high yields and was even shown to proceed at nanomolar concentrations, highlighting the fast reaction kinetics of the Pd-protein OACs.
In addition to Pd-mediated conjugation, proteins were also conjugated via cysteine arylation chemistry using bisarylboronic acid homobifunctional polymers and a Ni(II) catalyst. 67,74 This approach was used to produce GFP and T4 lysozyme homodimers with conversions of approximately 50%. 74 In addition, Cu(II)-catalyzed, histidine-directed, backbone N−H arylation and alkenylation of proteins with boronic acids was achieved. 75 By sequential addition of a heterobifunctional linker featuring 2-nitro-arylboronic acid and (E)-alkenylboronic acid functionalities, orthogonal Ni(II)-promoted cysteine arylation followed by Cu(II)-catalyzed histidine-directed backbone N−H alkenylation was achieved. 76 This strategy allowed heterodimeric protein−protein conjugates of T4 lysozyme and sfGFP to be generated with up to 93% conversion.
As with all approaches utilizing organometallic compounds, there remains the issue of complete removal of metal ions, which may chelate to proteins, altering their function or causing downstream toxicity. It was noted in the case of Pd-protein OACs that only 90% of the Pd, as determined by inductively coupled plasma mass spectrometry, was removed from the purified conjugates. 70 For metal-mediated protein−protein conjugation to find relevance in the generation of therapeutics, incomplete metal removal and the resulting potential for toxicity must be addressed.

3.2. Lysine Targeting Reagents. Reagents targeting the ϵ-amine of lysine residues are a popular approach to generate bispecific antibodies. The methods are simple to execute and have therefore been widely used, but suffer from issues including unselective labeling at multiple lysine residues, which can lead to sample and batch heterogeneity.
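The heterogeneity that results from unselective lysine labeling can be illustrated with a simple statistical model. The sketch below is an idealization and not taken from the cited work: it assumes a hypothetical number of equally accessible lysine residues that are each modified independently with the same probability, so the number of labels per protein follows a binomial distribution. Real lysines differ in accessibility and reactivity, and the site count and labeling probability used in the example are placeholders.

```python
from math import comb


def label_distribution(n_sites, p_label):
    """Probability of a protein carrying k = 0..n_sites labels (binomial model)."""
    return [
        comb(n_sites, k) * p_label**k * (1.0 - p_label) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


def mean_labels(n_sites, p_label):
    """Average number of labels per protein under the same model."""
    return n_sites * p_label


if __name__ == "__main__":
    n_sites, p_label = 10, 0.2  # hypothetical: 10 reactive lysines, 20% modified each
    dist = label_distribution(n_sites, p_label)
    print(f"mean labels per protein: {mean_labels(n_sites, p_label):.1f}")
    for k, prob in enumerate(dist[:6]):
        print(f"{k} labels: {prob:.1%}")
```

Even when the average degree of labeling is tuned to a target value, the model predicts a spread of species carrying zero, one, two, or more handles, which is one way to rationalize the sample and batch heterogeneity noted above.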
A popular approach involves an effective functional group interconversion from a primary amine (lysine) to a thiol. This can be achieved with N-succinimidyl 3-(2-pyridyldithio)propionate (SPDP), generating solvent exposed thiols on both antibodies after a reduction step (Figure 7). 77,78 This chemistry was applied to two separate antibody domains, and the resulting sulfhydryl-containing proteins were incubated together in a 1:1 ratio to generate bispecific conjugates linked by a disulfide bond. This strategy was used on multiple occasions to develop a range of bispecific antibodies with uses in imaging and as a potential therapeutic for autoimmune thyroiditis. 79,80 This method suffers from the lack of selectivity for homo- versus heterodimerization, resulting in a statistical mixture of products. It is also notable that disulfide bonds formed in bioconjugation can be unstable under physiological conditions, making these conjugates less useful as therapeutic agents. 81,82 Alternative approaches for selectively targeting lysine residues that overcome some of the challenges faced when using SPDP avoided the requirement for disulfide bond formation altogether. The cyclic thioimidate 2-iminothiolane, commonly known as Traut's reagent, has been used to introduce a free thiol at lysine, which can subsequently react with a maleimide functional group installed at lysine on a second protein using bifunctional succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) reagents (Figure 7). Bispecific antibodies capable of simultaneously directing stem cells to infarcted myocardium and T cells to tumors were generated using this method. 83,84 This linking strategy, combining Traut's reagent and SMCC reagents, was also used to generate therapeutic bispecific antibodies in other preclinical cancer studies.
85,86 Nonetheless, this approach suffers from the previously discussed retro-Michael addition and thiol exchange problem faced by all maleimide−thiol conjugation strategies. 37,39 A further lysine-selective method designed to overcome stability issues arising from disulfide or maleimide linkages was recently described. Benzaldehyde and hydrazine functional groups were introduced using NHS ester conjugation at lysine, and the resulting proteins were mixed in a 1:1 ratio, generating a hydrazone-linked species. 87 This approach was used to generate T cell recruiting bispecific antibodies to cancer cells overexpressing EGFR. However, it should be noted that hydrazones are not completely stable under aqueous conditions and can undergo hydrolysis. 88 3.3. Reagents Targeting Alternative Amino Acid Residues. Targeting amino acids beyond cysteine and lysine when generating protein−protein conjugates is currently an underdeveloped area. This is mainly due to the high selectivity achieved using cysteine-targeting reagents that is rarely possible with other canonical amino acids.
One approach which achieved protein−protein conjugation through alternative nucleophilic residues utilized a photocaged quinone methide functional group linked to an NHS-ester. 89 This approach installed the photocaged quinone functional group onto a protein at a lysine residue via amine acylation. Subsequent UV irradiation generated a highly reactive Michael acceptor in the form of a quinone methide. This intermediate was trapped with any of nine amino acid residues on a second protein domain: Asp, Glu, Lys, Ser, Thr, Tyr, Gln, Arg, and Asn, with Gln, Arg, and Asn being of particular interest as they have rarely been probed using other cross-linking approaches. This method was used to covalently cross-link proteins in vitro and generate protein−DNA cross-links. Although the objective of this research was to study biomolecular interactions through cross-linking and not to generate well-defined conjugates, the promiscuous nature of this approach clearly highlights the challenges that could arise when attempting to generate protein−protein conjugates through alternative amino acid residues.
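The "statistical mixture of products" noted earlier in this section for undirected thiol-based coupling of two lysine-modified proteins can be made concrete with a small calculation. The sketch below is illustrative only: it assumes, hypothetically, that each protein carries exactly one reactive thiol, that both proteins react with equal probability, and that dimers are the only products. Real lysine labeling typically installs several handles per protein, so higher-order oligomers would also be expected. The function name is an invention for this example.

```python
from fractions import Fraction

def dimer_distribution(fraction_a):
    """Expected dimer fractions for random pairing of proteins A and B.

    Assumptions (hypothetical): one reactive thiol per protein, equal
    reactivity of A and B, and dimers as the only products.
    """
    a = Fraction(fraction_a)
    b = 1 - a
    return {"AA": a * a, "AB": 2 * a * b, "BB": b * b}

# 1:1 mixing ratio, as used in the SPDP-based protocols described above.
mix = dimer_distribution(Fraction(1, 2))
for species, share in mix.items():
    print(f"{species}: {float(share):.0%}")
```

Under these idealized assumptions, at most half of the dimeric product is the desired heterodimer, which is one way to see why orthogonal handles on the two partners are attractive.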

GENETIC CODE EXPANSION
Protein−protein conjugates have also been prepared by incorporating unnatural amino acids (UAAs) via codon reallocation. Referred to as genetic code expansion (GCE), this process takes advantage of endogenous protein synthesis machinery to incorporate a reactive handle, such as an azide, which is chemically distinct from the functional groups of the 20 canonical amino acids. 90 This enables the use of click chemistry for site-specific protein modification and leads to products with excellent homogeneity. This selectivity, coupled with the diversity provided by over 200 reported UAAs, has led to the adoption of GCE for protein−protein conjugation, with a particular focus on bispecific antibodies. 91
The most common GCE-mediated approach for preparing bispecific antibodies is to combine two UAA-containing proteins with a bifunctional reagent featuring a flexible linker. Linker lengths and conjugation sites can be readily optimized, and this modular approach is amenable to the combinatorial generation of broad heterodimer libraries. 92
A well-studied UAA linking strategy involves the formation of an oxime followed by click chemistry. A notable example of this approach is the coupling of a p-acetylphenylalanine (pAcF) residue with an alkoxyamine bifunctional linker, via an oxime bond, to install either a terminal azide or an alkyne group into anti-HER2 Fabs. Once installed, the Fabs were conjugated with CuAAC. The resulting anti-HER2 Fab homodimers had an affinity comparable to that of a full-length IgG and exhibited subnanomolar killing (EC50 ≈ 20 pM) of HER2+ cancer cells in the presence of human T cells in vitro. 92 This methodology was expanded to generate higher valency IgG and Fab-based bispecific antibodies, including Tri-Fab, Tri-IgG, and Tetra-IgG conjugates, produced in yields of 30%, 25%, and 50%, respectively. 93 Although these conjugates are comparable to the most potent bispecific formats, the need for low pH (4.5) and long reaction times (72 h) may restrict the generality of this approach, rendering it incompatible with sensitive protein domains. 92,94
Other strategies utilized heterobifunctional linkers that react selectively with natural amino acids on one protein and unnatural amino acids on another. Such approaches are commonly based on site-selective cysteine chemistry, as highlighted by an aminooxy-maleimide reagent, which was used to conjugate a pAcF-containing Herceptin Fab to Sap 6 containing an engineered cysteine residue. 95,96 However, strategies that are not based on cysteine conjugation also exist. For example, by combining a UAA with an o-methoxyphenol side chain, which undergoes site-specific oxidative coupling with an aniline functionality in the presence of NaIO4, with an N-terminal-selective 2-pyridinecarboxaldehyde moiety, a well-defined dimer of RNase and the p-amino-L-phenylalanine-MS2 viral capsid was generated. 97
UAAs have also been deployed in the generation of full-length IgG immunotoxins (Figure 8). 13 Immunotoxins are chimeric fusion proteins consisting of a cancer-targeting antibody fragment and a bacterial protein toxin, which can be used to kill cancerous cells. 98 For example, Pseudomonas exotoxin (PE) containing the UAA azidophenylalanine was expressed in bacteria and conjugated to a HER2-targeting IgG that had been expressed in a mammalian system and functionalized with a DBCO handle via maleimide chemistry at an engineered cysteine residue. 13 This approach generated immunotoxins with highly target-specific cytotoxicity against HER2+ cell lines. Due to the inherent toxicity of PE to eukaryotic cell lines, production of full-length IgG immunotoxins via traditional fusion methods has not been successful to date. 13 The ability to independently express both IgG and PE domains in separate cell lines prior to chemical generation of immunotoxins clearly highlights the benefits of post-translational conjugation methods in generating otherwise inaccessible protein−protein conjugates.
Although these methods achieved site-specific protein−protein conjugation, the reagents necessitated a two-step conjugation process; after the first conjugation reaction, excess reagents were removed by dialysis or affinity purification. 99,100 This drawback led to the exploration of alternative GCE-based protein conjugation methods.
Inspired by examples found in nature, attempts have been made to prepare linker-free protein−protein conjugates, referred to as direct protein−protein conjugation. 101 Given that this strategy involves a single-site modification, protein−protein conjugation can potentially be achieved in a single-pot reaction with minimal impact on protein structure and function. Initial efforts utilized p-propargyloxyphenylalanine and p-azidophenylalanine side chains, which were coupled via CuAAC. Despite achieving protein−protein conjugation, the application of this method was limited by the requirement for a cell-free protein expression system, which resulted in low protein yields, and by Cu-induced protein damage. 101 In response to these limitations, a high-yielding conjugation method was developed that incorporated an azide-containing amino acid into one protein and a BCN-containing amino acid into the other. This allowed two proteins, glutathione S-transferase and a maltose-binding protein, to be conjugated via a SPAAC reaction, requiring no additional reagents. 99 Despite these advances, the most prominent drawback of UAA-mediated protein−protein conjugation is the efficiency with which the UAA can be incorporated, and in particular how the sequence context surrounding the in-frame UAG codon can restrict or prevent incorporation at particular sites. 102 However, new codon reassignment technologies, such as the one used to incorporate Nε-(o-azidobenzyloxycarbonyl)-L-lysine, are enabling the development of UAA-based protein−protein conjugation. 17,103

ENZYMATIC METHODS AND TAG ENGINEERING
Recent years have witnessed increased interest in enzymatic and tag-based methods for protein modification. 104,105 Such methods offer mild conditions and excellent site-specificity as a result of the enzymes used, but do require larger "sequence tags" (specific sequences of amino acids) to be incorporated via recombinant protein expression.

Sortase-Based Approaches.
One of the first examples using an enzymatic method to prepare a protein−protein conjugate employed sortase, a prokaryotic enzyme that catalyzes amide bond cleavage of a C-terminal LPXT↓G motif and transfers the remaining N-terminal polypeptide sequence to an appropriate nucleophile (Figure 9). Polyglycine sequences are privileged nucleophiles in this transpeptidation reaction, enabling coupling between one protein bearing the sortase tag at its C-terminus and a second protein bearing a polyglycine tag at its N-terminus. In a proof-of-concept study, this method was used to prepare dimers of GFP. 106
A sortase-mediated approach was used to couple the SRC Homology 3 (SH3) domain, which is insoluble at physiological pH, with the B1 domain of Protein G (GB1), thus endowing it with suitable solubility properties for structural characterization by NMR spectroscopy. 107 The study also discovered that yields and reaction times were improved by performing the coupling under conditions of dialysis, a result arising from removal of the cleaved glycine-containing peptide from the reaction equilibrium. 107
Sortase-based methods were also used to generate N-to-N and C-to-C linked protein−protein heterodimers, products that are impossible to generate using genetic approaches alone. This method relies upon the sortase-mediated installation of bioorthogonal click handles at either the N- or C-terminus, followed by SPAAC (Figure 11). 12 This hybrid approach using sortase-mediated bioconjugation followed by click chemistry was used to prepare a library of heterodimeric protein conjugates based on the individual ligands neuregulin-1β (NRG) or epidermal growth factor (EGF). 108 Each of the proteins intended for conjugation was functionalized with either a tetrazine handle at its C-terminus or a norbornene functional group at its N-terminus; subsequent mixing resulted in the formation of the desired heterodimers. 108
A similar hybrid approach was used to prepare bispecific antibodies with broad anti-influenza virus activity. 109 In this system, sortase-mediated conjugation was used to append DBCO and azide functional groups to the C-termini of two different IgGs. Interestingly, the addition of the DBCO functional group to the C-terminus was less efficient than that of the azide; this was suggested to arise from the promiscuous reaction of DBCO with free thiol groups that exist in both antibodies and sortase. 109 Nonetheless, upon mixing of the orthogonally tagged coupling partners at 20°C, the desired bispecific antibody formed and displayed excellent stability, with >90% remaining after 3 weeks at 37°C in IgG-depleted human serum. 109
This chemo-enzymatic approach was also applied to the preparation of a bispecific antibody that recruits T cells to acute myeloid leukemia (AML) cells. An antibody and scFv domain were conjugated using sortase-mediated addition of tetrazine and TCO functional groups to the respective coupling partners. 110 Building upon this hybrid approach, and using more recent knowledge that simple alkylamines can be substituted for the polyglycine motifs often used for sortase-mediated conjugation, protein−protein dimers and tetramers were produced. 111,112 The conjugation strategy first introduced appropriate click handles (DBCO and azide, respectively) at the C-termini of the nanobody Ty1. Simple mixing of the bioorthogonally tagged proteins produced the Ty1−Ty1 homodimer, while the use of a tetra-azide reagent with excess DBCO-tagged Ty1 produced the homotetramer (Ty1)4. The Ty1 nanobody binds to and neutralizes the SARS-CoV-2 virus, and the (Ty1)4 homotetramer achieved an IC50 value in the low picomolar range. 112
Sortase-based methods have developed to the point where increasingly ambitious applications have started to emerge. A library of bispecific binding proteins was generated from two orthogonal sets of sublibraries, one of which comprised proteins bearing a sortase-tag followed by a His6-tag at their C-termini. A second library consisted of proteins bearing a Gly5-tag preceded by a tobacco etch virus (TEV) protease cleavage site at their N-termini. Treatment of the combinatorial library with TEV protease, to reveal the N-terminal Gly5-tag, and sortase, for protein−protein coupling, led to the formation of the desired protein−protein heterodimers. Screening the library of bispecific antibodies led to the identification of both known (in protein fusion format) and unknown bispecifics that caused changes in cell proliferation in two cell lines of relevance to breast cancer. 113
Sortase-based methods for preparing protein−protein conjugates are becoming some of the most widely applied, in particular for the addition of orthogonal functional groups that enable protein−protein conjugation via click chemistry. Recent studies have shown that the C-terminal LPXTG acceptor motif works not only with N-terminal polyglycine motifs and primary amines but also with an engineered internal sequence (YKPH), which opens the door to nonlinear protein−protein conjugates using sortase-based methods. 114 Finally, incorporation of a cysteine residue before the C-terminal sortase sequence tag enabled both protein−protein conjugation and protein−fluorophore conjugation. 115
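The tag logic described in this section (a C-terminal LPXTG sortase motif on one partner and an N-terminal polyglycine tag on the other) can be sketched as a simple sequence check. The following is a minimal illustration, not a design tool: the function names, the 20-residue search window, the minimum glycine count, and the toy sequences are all assumptions made for this example, and real ligation efficiency depends on factors such as tag accessibility that a string search cannot capture.

```python
import re

# LPXTG sortase recognition motif; X is any residue (one-letter codes).
SORTASE_MOTIF = re.compile(r"LP[A-Z]TG")

def find_sortase_site(seq, window=20):
    """Return the start index of the last LPXTG motif that begins within the
    final `window` residues, or None. `window` is an illustrative cutoff."""
    start = max(0, len(seq) - window)
    matches = list(SORTASE_MOTIF.finditer(seq, start))
    return matches[-1].start() if matches else None

def n_terminal_glycines(seq):
    """Count consecutive glycine residues at the N-terminus."""
    return len(seq) - len(seq.lstrip("G"))

def ligation_product(donor, acceptor, min_glycines=1):
    """Sequence expected after transpeptidation, or None if a tag is missing.

    The donor is cut between T and G of LPXTG, and the LPXT-terminated
    fragment is joined to the glycine N-terminus of the acceptor.
    `min_glycines` is an illustrative threshold, not a literature value.
    """
    site = find_sortase_site(donor)
    if site is None or n_terminal_glycines(acceptor) < min_glycines:
        return None
    return donor[: site + 4] + acceptor

# Hypothetical toy sequences, not real proteins.
donor = "MKTAYLPETGHHHHHH"   # ends in LPETG followed by a His6-tag
acceptor = "GGGGGSAQ"        # Gly5-tagged coupling partner
print(ligation_product(donor, acceptor))
```

The product retains the LPXT portion of the motif followed by the glycine tag, which reflects the short peptide "scar" that sortase-mediated ligation leaves at the junction.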

Approaches Mediated by Tyrosine Oxidation.
Recently, tyrosine has garnered interest as a target residue for producing protein−protein conjugates. In the presence of a tyrosinase, the phenol side chain is oxidized to an ortho-quinone functional group, which forms the basis for further elaboration (Figure 10). A "knob-in-hole" antibody that features a G4Y motif at one of the C-termini was first oxidized with mushroom tyrosinase (mTyr) to provide an ortho-quinone group that can undergo a strain-promoted cycloaddition with BCN-functionalized proteins, including the cytokine IL2 or a single-chain variable fragment (scFv). 116 The BCN-functionalized coupling partners were themselves prepared using a sortase-mediated conjugation, highlighting the potential of combining conjugation strategies to provide protein−protein conjugates.
The C-terminal Y-tag G4Y has also been targeted with other oxidizing enzymes for protein−protein conjugation. 117 Treatment with enzymes such as laccase or horseradish peroxidase in conjunction with H2O2 can lead to the formation of protein−protein heterodimers including an IgG partner, although higher order oligomeric species have also been observed. 118−121 Later work discovered that the ortho-quinone functional group generated upon tyrosine oxidation reacts with the sulfhydryl group of a free cysteine residue. 122,123 This approach was used to prepare conjugates of sfGFP with three different proteins: CRISPR-Cas9, a HER2-binding scFv, and nanoluciferase. 122 Further work explored the use of different tyrosinases to expand the scope of tyrosine residues that could be targeted in this manner. 19 A tyrosinase from Bacillus megaterium (megaTYR) was found to be more promiscuous and enabled the oxidation of tyrosine residues in a large number of sequence motifs, as assayed by peptide experiments. Further engineering of megaTYR led to a variant that displayed high activity toward tyrosine residues in the E4Y motif, ultimately enabling the construction of a linear triple-protein conjugate comprising nanoluciferase, GFP, and mCherry. One limitation of direct residue-to-residue conjugation is that it does not present an opportunity for a longer linker between the two proteins. Such linkers are readily installed using the chemical cross-linking strategies described in Section 3. This limitation could become problematic when optimizing the binding of a bispecific antibody to an antigen.
The enzyme tubulin tyrosine ligase (TTL) appends a tyrosine residue to an α-tubulin-derived C-terminal recognition sequence (Tub). In a process conceptually related to metabolic engineering, a protein of interest bearing the Tub sequence was exposed to an analogue of tyrosine that contained either an azide or alkyne group on the phenyl ring (Figure 11). 124 In the presence of TTL, the modified tyrosine was ligated to the C-terminus, producing a protein with a bioorthogonal handle that could be used in a subsequent protein−protein conjugation step using CuAAC. This procedure was used to prepare a protein−protein homodimer of the GFP-binding protein (GBP) in ∼50% conversion after 90 min. The coupling of a trastuzumab-derived single-chain variable fragment (TscFv) and GBP produced a protein−protein heterodimer in 62% yield after purification by size exclusion chromatography (SEC). Fluorescence microscopy showed that this heterodimer could recruit GFP to the plasma membrane of cells overexpressing the HER2 receptor, a target of TscFv.

Other Examples.
Recently, a new enzymatic method emerged for protein−protein conjugation that requires only a minimal sequence tag (IKXE) for enzyme recognition. The E2 small ubiquitin-like modifier (SUMO)-conjugating enzyme, Ubc9, catalyzes the formation of an isopeptide bond between the lysine residue in the recognition sequence and a protein bearing a C-terminal thioester. This strategy was used to prepare protein−protein conjugates of α-synuclein with either ubiquitin or ISG15, the latter having the same C-terminal peptide sequence as ubiquitin. Protein−protein conjugation via formation of isopeptide bonds offers the opportunity to place the site of conjugation at different parts of the protein sequence, but further work is required to expand the scope of coupling partners beyond ubiquitin-like proteins before this becomes a general route to protein−protein conjugates. 125 C-to-C terminal protein−protein conjugation was also achieved by using an engineered asparaginyl ligase and a short bifunctional linker peptide that adds to proteins with a C-terminal sequence tag (NGLH). 126
Although intein-based methods are not strictly enzymatic, they have been used to prepare some protein−protein dimers. Inteins are peptide sequences that can be induced to cleave with concomitant ligation of the flanking peptide sequences (exteins). The pathway proceeds via a thioester intermediate, which can potentially be intercepted and used as a component in native chemical ligation (NCL) with another protein bearing an N-terminal cysteine residue. 127 In sequence, this process is referred to as expressed protein ligation (EPL) and has been used to prepare the protein−protein conjugate of histone H2B and ubiquitin featuring either the natural isopeptide linkage or a synthetically more tractable disulfide analogue. 128−130 In addition, Ub−Ub homodimers have been produced for NMR studies. 131
The formylglycine-generating enzyme (FGE) has been used to convert cysteine residues contained in a CXPXR motif into a formylglycine residue, which bears a reactive aldehyde group in its side chain. The aldehyde-containing protein was then treated with a bifunctional small molecule bearing (i) an aminooxy group for reaction with the formylglycine group and (ii) either an azide or alkyne group for further modification using click chemistry. Following independent preparation of azide-tagged and alkyne-tagged proteins, heterodimerization was achieved to provide full-length human IgG (155 kDa) conjugates with either human growth hormone (26 kDa) or the maltose-binding protein (42 kDa). 100 Recently, it was found that FGE-mediated protein−protein conjugation could be accelerated by freezing, an effect attributed to extreme changes in pH, ionic strength, and liquid water concentration as ice crystals form. 132
Figure 11. Enzymatic approaches to install click chemistry functional groups for protein−protein conjugation.
A recent report exhaustively tested different bioorthogonal coupling partners, which were enzymatically installed, for protein−protein conjugation. 21 The enzyme lipoic acid protein ligase (LAPL) recognizes the 13-residue LAP sequence (Figure 11) and catalyzes the formation of an isopeptide bond between an internal lysine residue in the LAP sequence and a carboxylic acid-bearing group in a small molecule probe. The 14 probes used in the study each featured bioorthogonal functional groups and enabled screening of several well-known click reactions leading to protein−protein conjugates. An optimal pairing was found with a tetrazine and trans-cyclooctene (TCO) operating under an IEDDA mechanism with an approximate second-order rate constant of 50 M⁻¹ s⁻¹ at 37°C in phosphate buffered saline.
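To give a feel for what a second-order rate constant of about 50 M⁻¹ s⁻¹ means at typical protein concentrations, the sketch below applies the standard integrated rate law for a bimolecular reaction A + B → P with equal starting concentrations. The 10 µM concentration is a hypothetical value chosen for illustration and is not taken from the study, which used two equivalents of one partner; the function names are likewise inventions for this example.

```python
def half_life_s(k, c0):
    """Half-life in seconds for A + B -> P with [A]0 = [B]0 = c0 (in M)
    and second-order rate constant k (in M^-1 s^-1): t_half = 1 / (k * c0)."""
    return 1.0 / (k * c0)

def conversion(k, c0, t):
    """Fractional conversion at time t (s) under the same equimolar
    assumption: x = k*c0*t / (1 + k*c0*t)."""
    kt = k * c0 * t
    return kt / (1.0 + kt)

K_IEDDA = 50.0   # M^-1 s^-1, approximate value quoted in the text
C0 = 10e-6       # M, hypothetical 10 uM of each coupling partner

print(f"half-life: {half_life_s(K_IEDDA, C0) / 60:.0f} min")
for hours in (1, 4, 24):
    print(f"{hours:>2} h: {conversion(K_IEDDA, C0, hours * 3600):.0%} conversion")
```

Under these assumed conditions the half-life is 2000 s (roughly half an hour), and because the rate falls as both partners are consumed, pushing an equimolar reaction toward completion takes disproportionately longer. This is one reason an excess of one partner or higher concentrations are often used in practice.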
The LAPL-based method was ultimately used to prepare a triple-protein conjugate of trastuzumab; the LAP-tag was added to each of the heavy chain C-termini and the tags were functionalized with GFP. Remarkably, the reaction was quantitative after 4 h using two stoichiometric equivalents of GFP and the product maintained low-nanomolar binding to HER2+ cells. 21

SpyTag/SpyCatcher-Based Methods.
The SpyTag/SpyCatcher system is not an enzymatic process, but is a popular method for preparing protein−protein conjugates and involves a sequence tag. The system emerged from studies of the second immunoglobulin-like collagen adhesion domain (CnaB2) from the fibronectin binding protein (FbaB) found in Streptococcus pyogenes (Spy, Figure 12). 133 The CnaB2 domain is exceptionally stable, remaining folded after boiling at pH 2, and harbors an isopeptide bond between an aspartic acid and a lysine residue. The SpyTag/SpyCatcher conjugation system was designed by splitting the CnaB2 Spy domain into two portions at this isopeptide bond: a tag comprising 13 amino acids (SpyTag) and the remaining protein sequence (SpyCatcher, 13 kDa). Upon mixing of the two fragments, the original CnaB2 domain is rapidly reconstituted. Fusing the SpyTag and SpyCatcher domains to two different proteins results in them forming a stable protein−protein conjugate, linked by the CnaB2 domain.
The SpyTag/SpyCatcher system was used to prepare bispecific antibodies that recognize two different domains of the transmembrane protein roundabout homologue 1 (ROBO1). The protein components that underwent conjugation were each scFv fragments, and the approach ultimately led to a tetravalent bispecific antibody (two scFv fragments per molecule) that displayed midrange picomolar affinity for its target, whereas the individual components bind in the mid-to-low nanomolar range. 134,135 Later work used the SpyCatcher/SpyTag system to build anti-HER3 antibodies from individual building blocks (Fc, Fab, and scFv regions) that each featured an appropriate Spy-based domain for conjugation. 136 A follow-up study produced a trivalent scFv; the central development that enabled this was synthesis of a peptide comprising three consecutive SpyTag sequences. Treatment of this peptide with an scFv-SpyCatcher fusion protein resulted in the desired anti-HER3 trivalent scFv. 137
An interesting practical development was recently disclosed that resulted in fewer protein purification steps. Individual proteins bearing respective SpyCatcher and SpyTag domains were expressed in HEK 293F cells; combining the cell culture media at 37°C for 3 h gave the desired protein−protein conjugate, which activated the canonical Wnt signaling pathway. 138 Further practical improvements were made by developing a protease-knockout variant E. coli strain that permits the expression of Spy-tagged Fabs into the periplasm. This enabled Spy-tagged antibody fragments to be used in a modular fashion, with examples such as coupling to Fc regions and enzymes. 139 Expression of Spy-tagged proteins has also been achieved in silkworms. 140 Additional applications of protein−protein conjugates generated using the SpyCatcher/SpyTag system include bispecific immune engagers, antibody−enzyme complexes, and bispecific antibodies. 141−143 A recent DogTag/DogCatcher pair enables this strategy to be applied to internal tag sequences in a loop-friendly manner. 144
Fast progress has been made, although it has been pointed out that, as a domain of a Streptococcus surface protein, SpyCatcher is expected to induce a strong immune response. 139 This could mean that while the Spy-system is well suited to screening and development, it may need to be replaced by a different conjugation system for therapeutics based on protein−protein conjugates. 139

OUTLOOK
Although post-translational approaches for generating protein−protein conjugates have been investigated for several decades, early techniques lacked the site selectivity required to produce well-defined protein−protein conjugates and, therefore, did not endure with the advent of powerful genetic engineering methods for producing fusion proteins. However, the past two decades have witnessed significant progress in site-selective and site-specific conjugation methods, enabling the preparation of protein−protein conjugates with greater control.
Figure 12. Design of SpyCatcher/SpyTag system and application to preparation of bispecific antibodies.
In general, post-translational chemical conjugation methods can overcome many of the challenges associated with fully expression-based fusion methods. In particular, the ability to achieve any desired topological arrangement of the target conjugates (N-to-N, C-to-C, internal-internal) obviates the requirement for N-to-C terminal ligation imposed by fusion methods. Additionally, proteins can be conjugated at a "late stage" in the overall preparation, leading to the development of some combinatorial techniques. 92,113
Within the broad field of site-selective bioconjugation, cysteine remains an essential target, and so cysteine-specific reactions are a dominant subfield of protein−protein conjugation. The low abundance and high nucleophilicity of cysteine make it the target of choice for many strategies, from traditional bismaleimide reagents to novel metal-mediated cross-coupling, as well as in some enzymatic and sequence tag-based approaches. Furthermore, even amine-selective strategies, such as those using SPDP or Traut's reagent, ultimately use the reactivity of thiols to generate the final protein−protein conjugates. This is evidence that cysteine and thiols in general are the favored reactive handles, and it is unlikely that this status quo will change in the near future. However, as more site-specific conjugation strategies targeting other canonical amino acids become available, perhaps a gradual shift away from the reliance on thiol-based methodologies will occur.
Another strength of chemical-based approaches is the diversity of linker motifs that can be generated with variation in chain length, flexibility, hydrophilicity, and their ability to be orthogonally functionalized with small molecules. Although some of these properties can be varied to some extent with peptide-based linkers using recombinant expression, each variation requires starting at the genetic level which can be both challenging and costly. On the other hand, chemical linkers present a wider design space than peptide-based linkers and libraries of pregenerated linkers can be produced and utilized in parallel with the same protein domains (which only require expression once). Studies on "linker-ology" in the field of PROTAC design suggest that exploring linker space in protein−protein conjugates is currently underdeveloped. 145 Linker design could, for example, play an important role in producing bispecific antibodies by enhancing antigen-binding properties in vivo.
A potential limitation of post-translational linking strategies that requires further exploration is whether certain linkers are immunogenic. Several protein−protein coupling reactions utilize functional groups that are not found in nature, such as DBCO or TCO, and the presence of these moieties as their respective click products could potentially lead to a larger immune response than PEG- or peptide-based linkers. In terms of applications, post-translationally generated protein−protein conjugates have typically been limited to extracellular roles. This is in contrast to genetic fusion proteins, which can be expressed intracellularly in vivo and therefore used to study a wide range of biological functions.
An issue encountered in the course of preparing the present article was the inconsistency in how conversions and yields were quoted. A consensus needs to be reached in the field on how best to quantify the conjugation efficiency of this class of reactions. We propose that a gold standard would be to quote the isolated yield of the protein−protein conjugate after an appropriate purification step, such as size exclusion chromatography. While conversion remains a useful descriptor for reaction screening and optimization, it should not be provided as the sole measure of a given conjugation process. Characterization by a battery of techniques, including circular dichroism and functional assays, should be reported in addition to standard analyses by SDS-PAGE or LC−MS. Reporting in line with these principles will allow researchers to select the conjugation technique that is most suitable for their specific requirements. In addition, comparisons between post-translational conjugation strategies (e.g., site-specific cysteine chemistry vs enzymatic conjugation) are typically not performed, and it is unclear whether a particular approach may present a general advantage over another.
Nonetheless, great strides have been taken in developing post-translational strategies for producing protein−protein conjugates. With this in mind, a fruitful period for the field is anticipated, in which many more strategies will be developed and the existing toolbox will be widely utilized to generate protein−protein conjugates that can be used to probe important biological questions.